Lake Water Quality Assessment From Landsat Thematic Mapper Data Using Neural Network: An Approach to Optimal Band Combination Selection

Author

  • Indrajeet Chaubey
Abstract

The concern about water quality in inland water bodies such as lakes and reservoirs has been increasing. Owing to the complexity associated with field collection of water quality samples and subsequent laboratory analyses, scientists and researchers have employed remote sensing techniques for water quality information retrieval. Due to the limitations of linear regression methods, many researchers have employed the artificial neural network (ANN) technique to decorrelate satellite data in order to assess water quality. In this paper, we propose a method that establishes the output sensitivity toward changes in the individual input reflectance channels while modeling water quality from remote sensing data collected by Landsat Thematic Mapper (TM). From the sensitivity, a hypothesis about the importance of each band can be made and used as a guideline to select appropriate input variables (band combination) for ANN models based on the principle of parsimony for water quality retrieval. The approach is illustrated through a case study of Beaver Reservoir in Arkansas, USA. The results of the case study are highly promising and validate the input selection procedure outlined in this paper. The results indicate that this approach could significantly reduce the effort and computational time required to develop an ANN water quality model.

(KEY TERMS: artificial neural network; water quality; remote sensing; band combination.)

Sudheer, K.P., Indrajeet Chaubey, and Vijay Garg, 2006. Lake Water Quality Assessment From Landsat Thematic Mapper Data Using Neural Network: An Approach to Optimal Band Combination Selection. Journal of the American Water Resources Association (JAWRA) 42(6):1683-1695.

Paper No. 05086 of the Journal of the American Water Resources Association (JAWRA), December 2006 (Copyright © 2006). Discussions are open until June 1, 2007.

K.P. Sudheer, Indrajeet Chaubey, and Vijay Garg are, respectively, Visiting Scientist, Associate Professor, and Ph.D. Candidate, Department of Biological and Agricultural Engineering, University of Arkansas, 203 Engineering Hall, Fayetteville, Arkansas 72701 (Current address/Chaubey: Department of Agricultural and Biological Engineering, Purdue University, West Lafayette, Indiana 47907) (E-Mail/Chaubey: [email protected]).

INTRODUCTION

Observations of chlorophyll-a (chl-a) and suspended sediment (SS) concentration (and many other parameters) provide quantitative information concerning water quality conditions of a water body. Accordingly, these observations can be used in various numerical schemes to help characterize the trophic status of an aquatic ecosystem. However, the number of available in situ measurements of water quality characteristics is usually limited, especially in spatial and temporal domains, because of the high cost of data collection and laboratory analysis (Panda et al., 2004). As chl-a and SS are optically active water quality parameters, many researchers have employed the digital evaluation of remote sensing information at visible and near infrared (NIR) wavelengths to assess these water quality parameters in various water bodies (e.g., Ritchie et al., 1990; Lathrop, 1992; Choubey, 1994).

Morel and Prieur (1977) classified water bodies into two types: Case I water body, or open ocean; and Case II water body, which is a coastal, estuary, or inland water body. In Case I, the major optically active constituent is chlorophyll, and the water does not have a great amount of suspended sediments. Consequently, the algorithms, though empirical, that relate sensor radiances to surface concentrations are effective, and the results are relatively good (Baruah et al., 2001). However, for Case II water bodies, the relationship between the sensor radiance and the water quality parameters becomes complex due to the interaction of many components such as chlorophyll, suspended sediments, and yellow substance, since all of them may be present in high concentrations. There is considerable scattering (even in NIR) from inland waters with high sediments (Baruah et al., 2001). Accordingly, the relationship between the sensor data and the constituent concentration becomes highly nonlinear, and most of the models currently in use that are based on linear regression or on principal component analysis often fail to accurately simulate constituent concentration under such conditions (Lathrop, 1992). In this context, data-driven models, which can discover relationships from input-output data even when the user does not have a complete physical understanding of the system, may be preferable.

In recent decades, the advent of increasingly efficient computing technology has provided exciting new tools for the mathematical modeling of dynamic systems. The ANN is one such tool that relates a set of predictor variables to a set of target variables. Artificial neural networks are well known, massively parallel computing models that have exhibited excellent performance in the resolution of complex problems in science and engineering. In recent years, the ANN technique, a data-driven modeling tool, has become increasingly popular for water quality modeling among researchers and practicing engineers (e.g., Keiner and Yan, 1998; Gross et al., 1999; Tanaka et al., 2000; Baruah et al., 2001; Panda et al., 2004).

Despite a plethora of studies on water quality modeling from remotely sensed data using ANN, there are still certain issues that are seldom addressed by researchers. For instance, besides the fundamental question of defining an adequate neural topology, choosing the right set of input variables for approximating a function by a neural network remains an unsatisfactorily resolved question. In remote sensing applications where correlated data of many bands are available, the selection of potential influencing variables becomes a challenge. Constructing models such as ANN from data with nontrivial dynamics involves the problem of how to choose the best model from within a class of models or how to choose among the competing classes. The model selection problem involves selecting k nonzero elements (λ, the parameters of the model) in a given nonlinear model, g(x,λ). Following the principle of parsimony, the smallest network that adequately captures the relationships in the training data could be considered a good model (Morgan et al., 2000; Sudheer, 2000).

It is observed that in most of the reported ANN-based models for water quality from remote sensing data, the input information was selected either by a trial-and-error procedure or arbitrarily. The trial-and-error procedure involves considerable computational time and requires the development and assessment of a number of models.

When building models such as ANNs, it is natural to assume that having more information is better than having less. Instinctively, one might think that ANN would work better if more inputs are presented because the input vector contains all the vital information. However, in practice, this is not the case, particularly when multivariate models are developed using ANN, because inclusion of any spurious input may significantly increase the learning complexity and lead to reduced performance of the models.

Our focus in this paper is to propose an analytical approach to identify the appropriate combination of input variables (remote sensing band data) while developing ANN water quality models from remote sensing data. We also illustrate the impact that the incorporation of spurious input variables has on the performance of ANN water quality models that use spectral reflectance. The central idea of the proposed method is that, based on the output sensitivity toward changes in the individual input reflectance channels (wavelength bands), a hypothesis about the importance of each band can be made and used as a guideline for selecting the appropriate input variables for modeling. We illustrate the proposed approach through a case study of Beaver Reservoir in Arkansas, USA.

ARTIFICIAL NEURAL NETWORK

An ANN attempts to mimic, in a very simplified way, human mental and neural structures and functions (Hsieh, 1993). It can be characterized as massively parallel interconnections of simple neurons that function as a collective system. The network topology consists of a set of nodes (neurons) connected by links and usually is organized in a number of layers. Each node in a layer receives and processes weighted input from the previous layer and transmits its output to nodes in the following layer through links. Each link is assigned a weight, which is a numerical estimate of the connection strength. The weighted summation of inputs to a node is converted to an output according to a transfer function (typically a sigmoid function). Most ANNs have three layers or more: an input layer, which is used to present data to the network; an output layer, which is used to produce an appropriate response to the given input; and one or more intermediate layers, which are used to act as a collection of feature detectors (Figure 1).

The multilayer perceptron (MLP) is the most popular ANN architecture in use today (Dawson and Wilby, 1998). It assumes that the unknown function ...
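The node computation and the sensitivity idea described in this excerpt can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the weights, the three reflectance values, and the names `mlp_forward` and `band_sensitivity` are hypothetical, and the central finite difference used to estimate sensitivity is a stand-in for whatever formulation the full paper derives.

```python
import math


def sigmoid(z):
    """Logistic transfer function applied to a node's weighted input sum."""
    return 1.0 / (1.0 + math.exp(-z))


def node_output(inputs, weights, bias):
    """Weighted summation of the inputs to one node, passed through the sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)


def mlp_forward(bands, hidden_layer, output_node):
    """Three-layer network: input bands -> hidden nodes -> single output node."""
    hidden = [node_output(bands, w, b) for w, b in hidden_layer]
    out_weights, out_bias = output_node
    return node_output(hidden, out_weights, out_bias)


def band_sensitivity(bands, hidden_layer, output_node, delta=1e-4):
    """Central finite-difference estimate of d(output)/d(band_i) for each band."""
    sens = []
    for i in range(len(bands)):
        up = list(bands)
        down = list(bands)
        up[i] += delta
        down[i] -= delta
        diff = (mlp_forward(up, hidden_layer, output_node)
                - mlp_forward(down, hidden_layer, output_node))
        sens.append(diff / (2.0 * delta))
    return sens


# Hypothetical weights and reflectances, chosen only for illustration.
# The third band has zero weight in every hidden node, so it acts as a
# spurious input that the output does not respond to.
HIDDEN_LAYER = [
    ([0.8, -0.5, 0.0], 0.1),
    ([-0.3, 0.9, 0.0], -0.2),
]
OUTPUT_NODE = ([1.2, -0.7], 0.05)
BANDS = [0.12, 0.08, 0.30]

if __name__ == "__main__":
    sensitivities = band_sensitivity(BANDS, HIDDEN_LAYER, OUTPUT_NODE)
    for i, s in enumerate(sensitivities, start=1):
        print(f"band {i}: sensitivity {s:+.4f}")
```

In this toy setup, a band whose sensitivity is near zero would be a candidate for removal from the input set, which is the kind of guideline the parsimony argument calls for. With a trained network, the sensitivities would normally be evaluated over many samples rather than at a single reflectance vector.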


Similar resources

Road Identification in Landsat Thematic Mapper Imagery Using Pulse-Coupled Neural Networks: An Initial Assessment

Classifying roads in remotely sensed imagery has been addressed by a number of research efforts. Detecting these features is important for a variety of endeavors such as agricultural assessment and urban planning. This study investigates the viability of using a pulse-coupled neural network to recognize roads in Landsat-4 Thematic Mapper multispectral imagery.

Comments on "Water Quality Retrievals From Combined Landsat TM Data and ERS-2 SAR Data in the Gulf of Finland"

This paper presents the applicability of combined Landsat Thematic Mapper and European Remote Sensing 2 synthetic aperture radar (SAR) data to turbidity, Secchi disk depth, and suspended sediment concentration retrievals in the Gulf of Finland. The results show that the estimated accuracy of these water quality variables using a neural network is much higher than the accuracy using simple and m...

Local Search for Optimal Global Map Generation Using Mid-Decadal Landsat Images

NASA and the US Geological Survey (USGS) are seeking to generate a map of the entire globe using Landsat 5 Thematic Mapper (TM) and Landsat 7 Enhanced Thematic Mapper Plus (ETM+) sensor data from the “mid-decadal” period of 2004 through 2006. The global map is comprised of thousands of scene locations and, for each location, tens of different images of varying quality to choose from. Furthermore...

Automated Method for Monitoring Water Quality Using Landsat Imagery

Regular monitoring of water quality is increasingly necessary to keep pace with rapid environmental change and protect human health and well-being. Remote sensing has been suggested as a potential solution for monitoring certain water quality parameters without the need for in situ sampling, but universal methods and tools are lacking. While many studies have developed predictive relationships ...

Utility of Landsat Thematic Mapper Data for Mapping Site Productivity in Tropical Moist Forests

Regression analysis was used to develop a relationship between Landsat 5 Thematic Mapper data and site quality of permanent sample plots. The thermal band (6) produced misleading results, but the ratio of band 4 on 5 showed promise when combined with geological information. Best results were obtained when regression analysis was restricted to ensure that at least one each of the visible (1 to 3...

Journal: Journal of the American Water Resources Association (JAWRA)

Volume 42, Issue 6

Pages 1683-1695

Publication date: 2006